Inconstancy of finite and infinite sequences 



Jean-Paul Allouche a,*, Laurence Maillard-Teyssier b

a CNRS, Institut de Math., Université P. et M. Curie, Case 189, 4 Place Jussieu, F-75252 Paris Cedex 05, France
b RTE, DMA, Immeuble le Colbert, 9 rue de la Porte de Buc, BP 561, 78005 Versailles Cedex, France



Abstract 

In order to study large variations or fluctuations of finite or infinite sequences (time series), we bring to light 
an 1868 paper of Crofton and the (Cauchy-)Crofton theorem. After surveying occurrences of this result in 
the literature, we introduce the inconstancy of a sequence and we show why it seems more pertinent than 
other criteria for measuring its variational complexity. We also compute the inconstancy of classical binary 
sequences including some automatic sequences and Sturmian sequences.

The voyage of the best ship is a zig-zag line of a hundred tacks.
See the line from a sufficient distance, and it straightens itself to
the average tendency... (Ralph Waldo Emerson, Emerson Essays, 1899)



1. Introduction 



How can one define and detect large variations or fluctuations of a sequence (with possible
applications to the [discrete] time evolution of biological, financial, musical phenomena and so on)? The usual
approach is based on computing the distance of the associated piecewise affine function to the corresponding 
linear regression line, i.e., on computing the residual variance. But this quantity somehow describes total 
distance to "regularity" , and says nothing about possibly large local fluctuations: for example, it may not 
discriminate between an exponentially growing function and a fractal-like "chaotic" (disordered) curve. In 
particular one should remember that dictionaries defining "fluctuation" use words with a similar meaning 
among which "wavering" , "unsteadiness" , "vacillation" , "erraticness" , "variability" , etc. 

We suggest here to bring to light - especially for applications to sequences - a paper of Crofton dated
1868 [20] (see also the papers of Cauchy [13], [14] and the papers of Steinhaus [58] and of Dupain, Kamae
and Mendès France [22]). Crofton studies the average number of intersection points of a curve with random
straight lines. But this average number can be thought of as a measure of the fluctuations of the curve.
Namely, for a straight line or a curve "looking like a straight line", this average number is equal to 1, while
it has a very large value for a "very complicated" curve. Following this idea, we propose a measure of large
variations of a sequence and we compare it with the residual variance. Conversely, this measure will allow
us to decide whether a sequence is "more complicated" than another in cases where the visual aspect does
not suffice to suggest an intuitive answer. We will also show that this measure can be applied to infinite
sequences satisfying some technical condition (in particular certain automatic sequences as well as Sturmian
sequences; see, e.g., [?]) to describe their "complexity".



* Corresponding author 

Email addresses: allouche@math.jussieu.fr (Jean-Paul Allouche), Laurence.Teyssier-Maillard@rte-france.com
(Laurence Maillard-Teyssier)



Preprint submitted to Elsevier 



February 1, 2011 



As will be recalled, the ideas of Cauchy and Crofton were already used in various contexts: one of our
purposes is to insist on their usefulness for measuring the complexity of discrete phenomena, as a compromise
between measuring intensity, time and consecutive repetitions. These ideas will be applied in a subsequent
paper (see [61]) to fluctuations of biological parameters, e.g., the weight, or the Quetelet index¹, often called
the BMI (Body Mass Index; see, e.g., [52], [8], [55], [54, 53]) for children: are "large fluctuations" of the BMI risk
factors for cardiovascular diseases in relation with the metabolic syndrome? This question was addressed with
other tools in [60] (see also the references therein). We also aim to try to apply this measure of fluctuations
to other questions, e.g., analyzing fluctuations of the stock market, and quantifying the "smoothness" of
musical themes.

2. Defining the Inconstancy of a curve 

A possible approach for describing large variations or large fluctuations of a curve is to "compare" it 
with a straight line. More precisely we can count the number of intersection points of random straight lines 
with the given curve: if this number is small on average, the curve behaves roughly as a straight line; if 
this number is large, the curve is "complicated" . Is there an "easy" way to compute this number? The 
Cauchy-Crofton theorem answers the question.

2.1. The Cauchy-Crofton theorem

Consider a plane curve $\Gamma$. Let $\ell(\Gamma)$ denote the length of $\Gamma$ and let $\delta(\Gamma)$ denote the perimeter of the closed
curve forming the edge of the convex hull of $\Gamma$. Let $\Omega(\Gamma)$ be the set of straight lines which intersect $\Gamma$. Any
line can be defined as the set of $(x, y)$ such that $x\cos\theta + y\sin\theta - p = 0$, where $\theta$ belongs to $[0,\pi)$ and $p$ is
a positive real number. A straight line is therefore completely determined by $(p,\theta)$. Letting $\mu$ denote the
Lebesgue measure on the set $\{(p,\theta),\ p \geq 0,\ \theta \in [0,\pi)\}$, the average number of intersection points between
the curve $\Gamma$ and a line in $\Omega(\Gamma)$ is defined by

$$\mathcal{N}(\Gamma) := \int_{D \in \Omega(\Gamma)} \frac{\#(\Gamma \cap D)}{\mu(\Omega(\Gamma))}\, dp\, d\theta.$$

The following result can be found in [20, p. 184-185]; see also the papers of Cauchy [13], [14].



Theorem 2.1 (Cauchy-Crofton). The average number of intersection points between the curve $\Gamma$ and the
straight lines in $\Omega(\Gamma)$ satisfies

$$\mathcal{N}(\Gamma) = \frac{2\ell(\Gamma)}{\delta(\Gamma)}.$$



Remark 2.2. In his paper, Crofton speaks of "Local or Geometrical Probability"; he writes about Probabilities,
"The rigorous precision, as well as the extreme beauty of the methods and results... the subtlety
and delicacy of the reasoning...", and he quotes Laplace: "ce calcul délicat". Crofton's result is explained in
Steinhaus' paper [58]. It is presented in an illuminating way with several examples in the paper of Dupain,
Kamae, and Mendès France [22]: these authors studied the notion of entropy of a curve and of temperature
of a curve introduced by Mendès France in [38]. Note that the occurrence of the number 2 in the numerator
can be understood by considering the case where $\Gamma$ is a segment: the average number of intersection points
is equal to 1, while the perimeter of the convex hull of the segment is twice the length of the segment (why
twice? go back to the definition ["closed curve..."] or think of the case where the segment is replaced by a
thin rectangle whose width tends to zero).

Remark 2.3. The reader will have noted that Crofton's approach has much to do with the famous Buffon
needle problem [11, p. 100-104] also known as the Buffon-Laplace needle problem; see [31, p. 359-360]. The
area of this type of result is known as "Integral Geometry". This terminology seems to have been introduced
by Blaschke in his "Vorlesungen über Integralgeometrie" [?], [10]. More recent references are the book of
Santaló [56], and the forthcoming book of Langevin [33] (see also [33]). An interesting review of the books
of Blaschke and of the 1936 edition of the book of Santaló is [45]. A nice exposition of the (proof of the)
theorem of Cauchy-Crofton, where the curve is only supposed to be rectifiable, can be found in the paper of
Ayari and Dubuc [6]. We also recommend for a first approach the texts of Mendès France [42] and of Teissier
[59]. Note that the Crofton theorem is also (and more correctly) called the Cauchy-Crofton theorem in the
literature.

¹ In [52] Quetelet asserts that weights vary like heights squared for adults but more like $(\text{heights})^{5/2}$ for children (see p.
52-53, and p. 61), while the "simplified" definition of the BMI is the ratio of the weight by the height squared.

Remark 2.4. Using the theorem of Cauchy-Crofton to define a measure of complexity of a curve was first
suggested by Mendès France [?, page 92]; also see [37], [38]. It was also proposed later, e.g., in [15] where
the name "folding index" is used. Also note that the Crofton formulas in [20] are used frequently in many
fields. These include complex motor behaviour in human movements [16] (also see [17], [?]), study of human
blood and transfusion [62], simulation of gravitational evolution [57], anisotropies of the secondary cosmic
microwave background [25], grain size distribution analysis for polycrystalline thin films [19], image analysis
of crystalline agglomerates [?], measurement of convolution in cotton fibers [28], all applications of LIS
(Line-Intercept Sampling), e.g., to the statistical analysis of vegetation or wildlife, see for example [63] and
the references therein (in particular [30] in the references below), spatial analysis of urban maps [24], in
a discussion about examples of information processing coming from neurophysiology, cognitive psychology,
and perception [48, pp. 1182-1185], and even relations between art and complexity [43] (also see [46], [47]).



2.2. The inconstancy of a curve 

The theorem of Cauchy-Crofton suggests the following definition. 

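Definition 2.5. Let $\Gamma$ be a plane curve of length $\ell(\Gamma)$ and such that the perimeter of its convex hull is equal
to $\delta(\Gamma)$. The inconstancy of the curve $\Gamma$, denoted $\mathcal{I}(\Gamma)$, is defined by

$$\mathcal{I}(\Gamma) := \frac{2\ell(\Gamma)}{\delta(\Gamma)}.$$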



Remark 2.6. The above definition and the Cauchy-Crofton theorem show that the inconstancy of the union 
of two curves is at most the sum of the inconstancies of these curves, that the inconstancy of a curve is equal 
to the inconstancy of its translated, rotated or homothetic curve, etc. 
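
As an illustration of the definition, here is a minimal Python sketch that computes the inconstancy of a piecewise affine curve through the points (0,0), (1,a_1), ..., (n,a_n). The function names `convex_hull` and `inconstancy` are ours and are not taken from the paper; the hull is built with Andrew's monotone chain, which is one possible choice among the standard convex hull algorithms.

```python
from math import dist


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of the cross product (a - o) x (b - o)
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inconstancy(values):
    """2 * length / hull perimeter for the polyline (0,0), (1,a_1), ..., (n,a_n).

    `values` is the non-empty list [a_1, ..., a_n].
    """
    pts = [(0.0, 0.0)] + [(float(i), float(a)) for i, a in enumerate(values, start=1)]
    length = sum(dist(p, q) for p, q in zip(pts, pts[1:]))
    hull = convex_hull(pts)
    if len(hull) == 2:
        # Degenerate hull (all points aligned): the "closed curve" is the
        # segment travelled twice, as explained in Remark 2.2.
        perimeter = 2 * dist(hull[0], hull[1])
    else:
        perimeter = sum(dist(hull[i], hull[(i + 1) % len(hull)]) for i in range(len(hull)))
    return 2 * length / perimeter
```

For a curve made of aligned points the function returns 1, and for the two-segment curve with a_1 = 1, a_2 = 0 it returns about 1.17.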



3. Comparison with other criteria 

Other criteria for measuring fluctuations of a discrete curve can be found in the literature for real (e.g., 
biological) phenomena: qualitative classification with predetermined cut-off points, maximal values, residual 
variance, etc. (see, e.g., the discussion in [60, pp. 316-317] for weight fluctuations). By oversimplifying most
of the various definitions, one could say that they aim to measure the "distance" between the considered 
curve and a straight line, but this distance can be computed globally or locally. We recall the definition of 
regression line, of residual variance, and of mean square error. 

Definition 3.1. Let $(x_i, y_i)_{i=1,2,\ldots,n}$ be a family of $n \geq 3$ points. Their regression line is the straight line
that minimizes the sum of squares of the vertical distances from the $(x_i, y_i)$'s to it. Letting $\bar{x} = (\sum_{1 \le i \le n} x_i)/n$ and
$\bar{y} = (\sum_{1 \le i \le n} y_i)/n$ denote the averages of the $x_i$'s and of the $y_i$'s, the equation of the regression line is

$$Y = aX + b, \quad \text{where} \quad a = \frac{\sum_{1 \le i \le n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{1 \le i \le n} (x_i - \bar{x})^2} \quad \text{(the slope of the line), and} \quad b = \bar{y} - a\bar{x}.$$

The MSE (i.e., mean square error) and the RMSE (i.e., root mean square error) of the $(x_i, y_i)$'s are defined
by

$$\mathrm{MSE} := \frac{1}{n-2} \sum_{1 \le i \le n} (y_i - a x_i - b)^2 \quad \text{and} \quad \mathrm{RMSE} := \sqrt{\mathrm{MSE}}.$$

The mean square error is sometimes called residual variance.

We also introduce some notation.

Definition 3.2. Let $n$ be a positive integer. We define $\Gamma(a_1, a_2, \ldots, a_n)$ to be the union of the $n$ segments
$(0,0) - (1,a_1)$, $(1,a_1) - (2,a_2)$, ..., $(n-1,a_{n-1}) - (n,a_n)$. (Note that we have $(n+1)$ points, and that, without
loss of generality, we suppose that the curve begins at the origin.)
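
The regression line and the residual variance of Definition 3.1 can be sketched in a few lines of Python. The function name `regression_mse` is ours. The normalisation by n - 2 is an assumption on our part: it is the one that reproduces the value (2a_1 - a_2)^2/6 stated in Lemma 3.3 for three points.

```python
def regression_mse(points):
    """Return (a, b, mse) for the least-squares line Y = aX + b.

    `points` is a list of (x, y) pairs with at least 3 elements and
    non-constant x. The mean square error is normalised by n - 2
    (assumption, chosen to agree with Lemma 3.3).
    """
    n = len(points)
    if n < 3:
        raise ValueError("at least 3 points are required")
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    a = s_xy / s_xx                 # slope of the regression line
    b = y_bar - a * x_bar           # intercept
    mse = sum((y - a * x - b) ** 2 for x, y in points) / (n - 2)
    return a, b, mse


def curve_points(values):
    """Vertices (0,0), (1,a_1), ..., (n,a_n) of the curve of Definition 3.2."""
    return [(0, 0)] + [(i, a) for i, a in enumerate(values, start=1)]
```

For instance `regression_mse(curve_points([1, 0]))` gives a slope 0, an intercept 1/3 and a residual variance 2/3.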

3.1. Why is MSE not satisfactory to measure fluctuations? 

In this section we show two curves having same length: one is "fluctuating" , the other increases quickly, 
but their residual variances are both equal to 6; see Figure 1. Note that when we say that the first curve
is more "fluctuating" than the second one, it means for example that for a variation of weight or of BMI,
the first curve is really fluctuating, while the second one just shows some (possibly quick) growth (also see
Remark 3.6 and the beginning of Section 4.3 below).




Figure 1: Same MSE 

3.2. Comparing MSE and inconstancy 

Are residual variance and inconstancy of a curve comparable? We will prove that this is not the case, 
even for very simple curves, thanks to two easy lemmas. 

Lemma 3.3. Let $\mathcal{R}(\Gamma(a_1, a_2))$ be the residual variance of the curve $\Gamma(a_1, a_2)$. Then

$$\mathcal{R}(\Gamma(a_1, a_2)) = \frac{(2a_1 - a_2)^2}{6}.$$

Proof. The computation is straightforward. The linear regression straight line is parallel to $(0,0) - (2,a_2)$,
and it contains the center of gravity of the triangle $(0,0)$, $(1,a_1)$, $(2,a_2)$. Or simply compute from
Definition 3.1: $\bar{x} = 1$, $\bar{y} = (a_1 + a_2)/3$, $a = a_2/2$, and $b = (2a_1 - a_2)/6$, hence $\mathcal{R}(\Gamma(a_1, a_2)) = (2a_1 - a_2)^2/6$.
□

Lemma 3.4. Let $\Gamma(a_1, a_2)$ be the curve defined as the union of the two straight line segments $(0,0) - (1,a_1)$
and $(1,a_1) - (2,a_2)$. Then $\mathcal{I}(\Gamma(a_1, a_2))$, the inconstancy of $\Gamma(a_1, a_2)$, is given by

$$\mathcal{I}(\Gamma(a_1, a_2)) = \frac{2}{1 + \dfrac{\sqrt{a_2^2 + 4}}{\sqrt{a_1^2 + 1} + \sqrt{(a_2 - a_1)^2 + 1}}}.$$

Proof. The proof is again straightforward. The length of $\Gamma(a_1, a_2)$ and the perimeter of the convex hull of
$\Gamma(a_1, a_2)$ are given respectively by

$$\sqrt{a_1^2 + 1} + \sqrt{(a_2 - a_1)^2 + 1} \quad \text{and} \quad \sqrt{a_1^2 + 1} + \sqrt{(a_2 - a_1)^2 + 1} + \sqrt{a_2^2 + 4}.$$

□

We can now state the non-comparability of residual variance and inconstancy.



Figure 2: Comparing residual variance and inconstancy

Proposition 3.5. Residual variance and inconstancy of a curve are not comparable. More precisely, there
exist four curves $\Gamma_i$, $i = 1,2,3,4$, (see Figure 2 and the proof below) such that, if $\mathcal{R}(\Gamma_i)$ and $\mathcal{I}(\Gamma_i)$ are their
residual variances and inconstancies, then the following inequalities hold:

$$\mathcal{R}(\Gamma_1) < \mathcal{R}(\Gamma_2) < \mathcal{R}(\Gamma_3) < \mathcal{R}(\Gamma_4)$$
$$\mathcal{I}(\Gamma_4) < \mathcal{I}(\Gamma_2) < \mathcal{I}(\Gamma_1) < \mathcal{I}(\Gamma_3).$$

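Proof. Using Lemmas 3.3 and 3.4 above, we get the residual variances $\mathcal{R}(\Gamma_i)$ and inconstancies $\mathcal{I}(\Gamma_i)$ of the
following curves

$$\begin{array}{lll}
\Gamma_1 := \Gamma(1,0) & \mathcal{R}(\Gamma_1) = \dfrac{2}{3} \approx 0.67 & \mathcal{I}(\Gamma_1) = \dfrac{2\sqrt{2}}{1 + \sqrt{2}} \approx 1.17 \\[2ex]
\Gamma_2 := \Gamma(0,3) & \mathcal{R}(\Gamma_2) = \dfrac{3}{2} \approx 1.50 & \mathcal{I}(\Gamma_2) = \dfrac{2 + 2\sqrt{10}}{1 + \sqrt{10} + \sqrt{13}} \approx 1.07 \\[2ex]
\Gamma_3 := \Gamma(2,0) & \mathcal{R}(\Gamma_3) = \dfrac{8}{3} \approx 2.67 & \mathcal{I}(\Gamma_3) = \dfrac{2\sqrt{5}}{1 + \sqrt{5}} \approx 1.38 \\[2ex]
\Gamma_4 := \Gamma(0,5) & \mathcal{R}(\Gamma_4) = \dfrac{25}{6} \approx 4.17 & \mathcal{I}(\Gamma_4) = \dfrac{2 + 2\sqrt{26}}{1 + \sqrt{26} + \sqrt{29}} \approx 1.06
\end{array}$$

□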
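
The two opposite orderings of Proposition 3.5 can be checked numerically from the closed forms of Lemmas 3.3 and 3.4. The following Python sketch does so; the names `residual_variance`, `inconstancy` and the labels `Gamma_1`, ..., `Gamma_4` are ours.

```python
from math import sqrt


def residual_variance(a1, a2):
    """Closed form of Lemma 3.3 for the curve (0,0) - (1,a1) - (2,a2)."""
    return (2 * a1 - a2) ** 2 / 6


def inconstancy(a1, a2):
    """Closed form of Lemma 3.4: 2 * length / perimeter of the triangle hull."""
    length = sqrt(a1 ** 2 + 1) + sqrt((a2 - a1) ** 2 + 1)
    perimeter = length + sqrt(a2 ** 2 + 4)
    return 2 * length / perimeter


curves = {
    "Gamma_1": (1, 0),
    "Gamma_2": (0, 3),
    "Gamma_3": (2, 0),
    "Gamma_4": (0, 5),
}

# Sort the four curves by increasing residual variance, then by increasing inconstancy.
order_by_variance = sorted(curves, key=lambda name: residual_variance(*curves[name]))
order_by_inconstancy = sorted(curves, key=lambda name: inconstancy(*curves[name]))

for name, (a1, a2) in curves.items():
    print(name, round(residual_variance(a1, a2), 2), round(inconstancy(a1, a2), 2))
print(order_by_variance)
print(order_by_inconstancy)
```

The first ordering is Gamma_1, Gamma_2, Gamma_3, Gamma_4 while the second is Gamma_4, Gamma_2, Gamma_1, Gamma_3, in agreement with the proposition.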



Remark 3.6. Comparing $\mathcal{I}(\Gamma_2)$ and $\mathcal{I}(\Gamma_4)$ shows again that "fluctuating" is not the same as "growing".

More generally, with the notation of Lemma 3.4 above, looking at $\mathcal{I}(\Gamma(0, x))$ shows that $\mathcal{I}(\Gamma(0, 0)) = 1 =
\lim_{x \to \infty} \mathcal{I}(\Gamma(0, x))$. When $x$ varies from $0$ to $\infty$ the quantity $\mathcal{I}(\Gamma(0, x))$ increases from 1 to a small value > 1
then it decreases back to 1.

Remark 3.7. There are other quantities that also "measure" the fluctuations of a curve. For example,
keeping the notations of Definition 3.1, the total variation is defined as the mean of $(y_i - \bar{y})^2$, i.e., as
$(\sum_{1 \le i \le n} (y_i - \bar{y})^2)/n$; the maximal distance is defined as $\max_{1 \le i \le n} |y_i - a x_i - b|$. The reader can easily
compute these quantities for the curve $\Gamma(a_1, a_2)$ and check that they are not comparable to the inconstancy
of $\Gamma(a_1, a_2)$.

$$\text{Total variation: } \frac{2(a_1^2 + a_2^2 - a_1 a_2)}{9}. \qquad \text{Maximal distance: } \frac{|2a_1 - a_2|}{3}.$$



4. Pertinence of the use of inconstancy: simple arguments 

4.1. A single fluctuation

Taking again the example in the previous section of a curve consisting of two straight line segments, let
us vary the value $a_1$, say $x := a_1$, and fix $a_2 = a$ (see Figure 3). The inconstancy $\mathcal{I}(\Gamma(x, a))$ is thus given by

$$\mathcal{I}(\Gamma(x, a)) = \frac{2}{1 + \dfrac{\sqrt{a^2 + 4}}{\sqrt{x^2 + 1} + \sqrt{(a - x)^2 + 1}}}.$$

This map $x \to \mathcal{I}(\Gamma(x, a))$ is increasing for $x > a/2$, which is in agreement with what a "fluctuation" should
be.




Figure 3: Varying the intermediate value 



It is clear that $\mathcal{I}(\Gamma(x, a)) = \mathcal{I}(\Gamma(a - x, a))$, which shows that the line $x = a/2$ is a symmetry axis.
In other words, "exchanging" the two segments, more precisely replacing $((0,0) - (1,x))$, $((1,x) - (2,a))$ by
$((0,0) - (1,a-x))$, $((1,a-x) - (2,a))$, does not change the inconstancy (see Figure 4). Of course this is a
necessary condition for a fluctuation criterion.

It is easy to show that $\mathcal{I}(\Gamma(a/2, a)) = 1$ (no fluctuation) and $\lim_{x \to +\infty} \mathcal{I}(\Gamma(x, a)) = 2$ (when $x$ is large,
the value of $x$ is not really important, the inconstancy is close to 2). We also have that $(\mathcal{I}(\Gamma(x, a)))' = 0$ if
and only if $x = a/2$. In particular the graph of the function $\mathcal{I}(\Gamma(x, a))$ has the aspect shown in Figures 5
and 6.

We note that the curve is "flat" in the neighborhood of $a/2$, or even for $x \in (0, a)$; see Figure 5. This
means that the inconstancy $\mathcal{I}(\Gamma(x, a))$, which is equal to 1 for $x = a/2$, remains close to 1 when the two
slopes of the curve have the same sign, while it is larger when the signs of the slopes are opposite, which
correctly describes what a fluctuation should be (the MSE does not have this property); see Figure 7. Also
the inconstancy $\mathcal{I}(\Gamma(x, a))$ tends quickly to 2 when $a$ is small: see Figure 6.

Figure 4: Symmetry

Figure 5: Graph of $\mathcal{I}(\Gamma(x, a))$
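
The properties of the map x -> I(Gamma(x, a)) listed in this section (value 1 at x = a/2, symmetry about a/2, monotonicity for x > a/2, limit 2) can be observed numerically. The sketch below uses the closed form of Lemma 3.4 with a_1 = x and a_2 = a; the function name `inconstancy_single` and the sample value a = 3 are ours.

```python
from math import sqrt


def inconstancy_single(x, a):
    """Inconstancy of the two-segment curve (0,0) - (1,x) - (2,a)."""
    two_sides = sqrt(x * x + 1) + sqrt((a - x) ** 2 + 1)
    third_side = sqrt(a * a + 4)
    return 2 / (1 + third_side / two_sides)


a = 3.0                                         # arbitrary sample value of a_2
xs = [a / 2 + 0.5 * k for k in range(9)]        # sample points x >= a/2
values = [inconstancy_single(x, a) for x in xs]

for x, value in zip(xs, values):
    # The mirrored point a - x gives the same inconstancy.
    print(f"x = {x:5.2f}   I = {value:.4f}   I(a - x) = {inconstancy_single(a - x, a):.4f}")
```

The printed values start at 1 for x = a/2, increase with x, and stay below 2.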

4.2. General remarks

If we look more generally at the inconstancy of $\Gamma(a_1, a_2, \ldots, a_n)$, what will clearly matter for its size is the
sequence of slopes: growth and signs of consecutive terms are crucial characteristics of the sequence, which
corresponds to the intuitive idea of "fluctuation". Of course we always have the straightforward bounds

$$1 \leq \mathcal{I}(\Gamma(a_1, a_2, \ldots, a_n)) \leq n$$

(count the possible number of intersection points of $\Gamma(a_1, a_2, \ldots, a_n)$ with a random straight line and apply
Theorem 2.1).

Figure 6: Varying $a$ in the graph of $\mathcal{I}(\Gamma(x, a))$

Conversely the inconstancy may be used to discriminate between curves, i.e., to decide whether a curve
fluctuates more than another, when the "visual aspect" does not suffice to assert an intuitive answer. We
give two examples.



4.3. Fluctuations of curves with four points

In Figure 8 inconstancies make it possible to discriminate between "less fluctuating" and "more fluctuating" curves,
though there is no visual evidence of which curve fluctuates more. It is interesting to note that the maximum
of the function is not really taken into account, only the variations count (look, e.g., at the two examples
with inconstancy 1.58 in Figure 8).

4.4. A case where inconstancy does not discriminate

The lengths and inconstancies of the two curves $\Gamma(\sqrt{3}, \sqrt{3}, 0)$ and $\Gamma(2\sqrt{6}/5, 4\sqrt{6}/5, 0)$ (see Figure 9) are
the same.



5. Inconstancy of sequences 

Inconstancy of (finite or infinite) sequences can be defined in a straightforward way from what precedes. 

Definition 5.1. Let $(u_n)_{0 \le n \le N}$ be a finite sequence of real numbers, with $u_0 = 0$ say. Let $\Gamma_N$ be the union
of the straight line segments $(0,0) - (1,u_1)$, $(1,u_1) - (2,u_2)$, ..., $(N-1,u_{N-1}) - (N,u_N)$, then the inconstancy
of $(u_n)_{0 \le n \le N}$ is defined by

$$\mathcal{I}((u_n)_{0 \le n \le N}) := \mathcal{I}(\Gamma_N).$$

Let $(u_n)_{n \ge 0}$ be an infinite sequence of real numbers, with $u_0 = 0$ say. Then the inconstancy of $(u_n)_{n \ge 0}$ is
defined by

$$\mathcal{I}((u_n)_{n \ge 0}) := \limsup_{N \to \infty} \mathcal{I}((u_n)_{0 \le n \le N}) \quad \text{(or } \lim_{N \to \infty} \mathcal{I}((u_n)_{0 \le n \le N}) \text{ if the limit exists).}$$

Figure 7: Signs of slopes

The inconstancy of an infinite sequence depends in particular on how long and how frequently the sequence
levels off: this is particularly clear for binary sequences as shown in Theorem 5.2 below.

Theorem 5.2.
(i) Let $(u_n)_{0 \le n \le N}$ be a finite sequence taking two values $0$ and $h > 0$, with $u_0 = 0$.
Let $\alpha \geq 1$ be the index such that $u_0 = u_1 = \ldots = u_{\alpha-1} = 0$ and $u_\alpha \neq 0$. In other words $\alpha$ is the
length of the longest initial string of 0's. Analogously let $\beta$ be the length of the longest final string of
0's. If $\beta = 0$, let $\gamma \geq 0$ be the largest index such that $u_\gamma = 0$. Let $N_{00}, N_{hh}, N_{0h}, N_{h0}$ be respectively
the number of blocks of the form $00$, $hh$, $0h$, $h0$ in the sequence. Then

$$\mathcal{I}((u_n)_{0 \le n \le N}) = \begin{cases}
2\,\dfrac{N_{00} + N_{hh} + \sqrt{1 + h^2}\,(N_{0h} + N_{h0})}{\sqrt{h^2 + \alpha^2} + N - \alpha - \beta + \sqrt{h^2 + \beta^2} + N} & \text{if } \beta > 0; \\[3ex]
2\,\dfrac{N_{00} + N_{hh} + \sqrt{1 + h^2}\,(N_{0h} + N_{h0})}{\sqrt{h^2 + \alpha^2} + N - \alpha + \sqrt{h^2 + (N - \gamma)^2} + \gamma} & \text{if } \beta = 0.
\end{cases}$$

(ii) Let $(u_n)_{n \ge 0}$ be an infinite sequence taking two values $0$ and $h > 0$, with $u_0 = 0$. We make the
assumption that the frequencies of occurrences of the blocks $00$, $hh$, $0h$, $h0$ in the sequence exist and
are respectively equal to $\mathcal{F}_{00}, \mathcal{F}_{hh}, \mathcal{F}_{0h}, \mathcal{F}_{h0}$. Then

$$\mathcal{I}((u_n)_{n \ge 0}) = \mathcal{F}_{00} + \mathcal{F}_{hh} + \sqrt{1 + h^2}\,(\mathcal{F}_{0h} + \mathcal{F}_{h0}) = 1 + (\sqrt{1 + h^2} - 1)(\mathcal{F}_{0h} + \mathcal{F}_{h0}).$$

Similarly let $(u_n)_{n \ge 0}$ be an infinite sequence taking only finitely many real values, and let $H$ be this
set of values. We make the assumption that the frequencies of occurrences of all length-2 blocks $jj'$
($j, j' \in H$) exist and are respectively equal to $\mathcal{F}_{jj'}$. Then

$$\mathcal{I}((u_n)_{n \ge 0}) = \sum_{j \in H} \mathcal{F}_{jj} + \sum_{j,j' \in H,\ j < j'} \sqrt{1 + (j' - j)^2}\,(\mathcal{F}_{jj'} + \mathcal{F}_{j'j})
= 1 + \sum_{j,j' \in H,\ j < j'} \left(\sqrt{1 + (j' - j)^2} - 1\right)(\mathcal{F}_{jj'} + \mathcal{F}_{j'j}).$$

Figure 8: Inconstancy discriminates between fluctuations



Proof. First let $(u_n)_{0 \le n \le N}$ be a finite sequence taking two values $0$ and $h > 0$. Let $\alpha \geq 1$ be the index such
that $u_0 = u_1 = \ldots = u_{\alpha-1} = 0$ and $u_\alpha \neq 0$. In other words $\alpha$ is the length of the longest initial string of
0's. Analogously let $\beta \geq 0$ be the length of the longest final string of 0's. Finally, if $\beta = 0$, let $\gamma \geq 0$ be
the largest index such that $u_\gamma = 0$. It is almost immediate that the convex hull of the curve consists of
the four straight line segments $((0,0) - (\alpha,h))$, $((\alpha,h) - (N-\beta,h))$, $((N-\beta,h) - (N,0))$, $((0,0) - (N,0))$ if
$\beta > 0$, and $((0,0) - (\alpha,h))$, $((\alpha,h) - (N,h))$, $((0,0) - (\gamma,0))$, $((\gamma,0) - (N,h))$ if $\beta = 0$ (there are only three
segments if $\gamma = 0$, which implies $\alpha = 1$). Hence

$$\delta(\Gamma_N) = \begin{cases}
\sqrt{h^2 + \alpha^2} + N - \alpha - \beta + \sqrt{h^2 + \beta^2} + N & \text{if } \beta > 0; \\
\sqrt{h^2 + \alpha^2} + N - \alpha + \sqrt{h^2 + (N - \gamma)^2} + \gamma & \text{if } \beta = 0,
\end{cases}$$

while the length of the curve is

$$\ell(\Gamma_N) = N_{00} + N_{hh} + \sqrt{1 + h^2}\,(N_{0h} + N_{h0}).$$

Figure 9: Same inconstancy

This gives the first part of the theorem, namely

$$\mathcal{I}((u_n)_{0 \le n \le N}) = \begin{cases}
2\,\dfrac{N_{00} + N_{hh} + \sqrt{1 + h^2}\,(N_{0h} + N_{h0})}{\sqrt{h^2 + \alpha^2} + N - \alpha - \beta + \sqrt{h^2 + \beta^2} + N} & \text{if } \beta > 0; \\[3ex]
2\,\dfrac{N_{00} + N_{hh} + \sqrt{1 + h^2}\,(N_{0h} + N_{h0})}{\sqrt{h^2 + \alpha^2} + N - \alpha + \sqrt{h^2 + (N - \gamma)^2} + \gamma} & \text{if } \beta = 0.
\end{cases}$$

In order to prove the second part of the theorem, we will directly address the case of a sequence $(u_n)_{n \ge 0}$
taking any finite number of values (the proof is simpler than our original one, thanks to a remark of one of
the referees). The length of the curve $\Gamma_N$ clearly is

$$\sum_{j \in H} N_{jj} + \sum_{j,j' \in H,\ j < j'} \sqrt{1 + (j' - j)^2}\,(N_{jj'} + N_{j'j}).$$

The perimeter of the convex hull of $\Gamma_N$ satisfies

$$2\sqrt{N^2 + u_N^2} \leq \delta(\Gamma_N) \leq 2N + 2M_N, \quad \text{where } M_N := \max\{u_n,\ 0 \le n \le N\}$$

(the inequality on the left is due to the fact that the perimeter is larger than twice the distance between
$(0,0)$ and $(N,u_N)$; the right inequality comes from the fact that the length of the convex hull is less than
the perimeter of the rectangle $(0,0) - (0,M_N) - (N,M_N) - (N,0)$). Since the sequence $(u_n)_{n \ge 0}$ takes only
finitely many values, this shows that

$$\delta(\Gamma_N) = 2N + O(1).$$

Hence

$$\mathcal{I}((u_n)_{n \ge 0}) = \sum_{j \in H} \mathcal{F}_{jj} + \sum_{j,j' \in H,\ j < j'} \sqrt{1 + (j' - j)^2}\,(\mathcal{F}_{jj'} + \mathcal{F}_{j'j}).$$

But

$$\sum_{j \in H} \mathcal{F}_{jj} + \sum_{j,j' \in H,\ j < j'} (\mathcal{F}_{jj'} + \mathcal{F}_{j'j}) = 1,$$

hence the result. □
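
Part (i) of Theorem 5.2 translates directly into a short computation. The Python sketch below is only an illustration under the hypotheses of the theorem (values in {0, h}, u_0 = 0, at least one term equal to h); the function name `binary_inconstancy` is ours.

```python
from math import sqrt


def binary_inconstancy(u, h):
    """Inconstancy of the finite sequence u = [u_0, ..., u_N] with values in {0, h}.

    Implements the closed formula of Theorem 5.2 (i); requires u_0 == 0,
    h > 0 and at least one term equal to h.
    """
    if h <= 0 or u[0] != 0 or h not in u or any(v not in (0, h) for v in u):
        raise ValueError("expected values in {0, h} with u_0 = 0 and some u_n = h")
    N = len(u) - 1
    flat = sum(1 for p, q in zip(u, u[1:]) if p == q)   # N_00 + N_hh
    jumps = N - flat                                    # N_0h + N_h0
    length = flat + sqrt(1 + h * h) * jumps
    alpha = u.index(h)               # length of the initial string of 0's
    beta = u[::-1].index(h)          # length of the final string of 0's
    if beta > 0:
        perimeter = (sqrt(h * h + alpha ** 2) + N - alpha - beta
                     + sqrt(h * h + beta ** 2) + N)
    else:
        gamma = N - u[::-1].index(0)  # largest index n with u_n = 0
        perimeter = (sqrt(h * h + alpha ** 2) + N - alpha
                     + sqrt(h * h + (N - gamma) ** 2) + gamma)
    return 2 * length / perimeter


# A long prefix of the periodic sequence 0101...: by part (ii) the limit is
# 1 + (sqrt(2) - 1) * 1 = sqrt(2), since every length-2 block is 01 or 10.
prefix = [0, 1] * 5000 + [0]
print(binary_inconstancy(prefix, 1), sqrt(2))
```

The finite value for the long prefix is already within about 1e-4 of the limit given by part (ii), which illustrates that the perimeter of the hull behaves like 2N + O(1).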



6. Computing the inconstancy of classical sequences 

Using Theorem 5.2 we see that the inconstancy of an infinite binary (0,1)-sequence must belong to the
interval $[1, \sqrt{2}]$. Bound 1 is reached if and only if $\mathcal{F}_{01} + \mathcal{F}_{10} = 0$, i.e., $\mathcal{F}_{01} = \mathcal{F}_{10} = 0$. Bound $\sqrt{2}$ is reached
when $\mathcal{F}_{01} + \mathcal{F}_{10} = 1$, i.e., $\mathcal{F}_{00} + \mathcal{F}_{11} = 0$, i.e., $\mathcal{F}_{00} = \mathcal{F}_{11} = 0$. Let us compute the inconstancy of some
classical binary sequences.

6.1. Periodic sequences

The sequence $(0^d 1)^\infty = (00\ldots01)^\infty$ (periodic of period $(d + 1)$, where the period pattern consists of $d$
symbols 0 followed by one symbol 1). It is easy to compute $\mathcal{F}_{00} = \frac{d-1}{d+1}$, $\mathcal{F}_{11} = 0$, $\mathcal{F}_{01} = \mathcal{F}_{10} = \frac{1}{d+1}$. Hence

$$\mathcal{I}((0^d 1)^\infty) = \frac{d - 1 + 2\sqrt{2}}{d + 1}.$$

In particular, $\mathcal{I}((01)^\infty) = \sqrt{2} = 1.414\ldots$ while $\mathcal{I}((0^d 1)^\infty)$ tends to 1 when $d$ tends to infinity: this corresponds
to the fact that the curve becomes more and more flat when $d$ increases. The case $d = 1$ is somehow the
worst case among periodic and nonperiodic binary sequences in terms of levelling off (or flatness).

6.2. Random sequences

A random sequence of 0's and 1's. For almost all binary sequences we have $\mathcal{F}_{00} = \mathcal{F}_{11} = \mathcal{F}_{01} = \mathcal{F}_{10} = \frac{1}{4}$.
Hence if $(r_n)_{n \ge 0}$ is "a random sequence" of 0's and 1's, then

$$\mathcal{I}((r_n)_{n \ge 0}) = \frac{1 + \sqrt{2}}{2} = 1.207\ldots$$

6.3. Some automatic sequences 

We first recall a few notions of combinatorics on words; see, e.g., [?]. A finite set is called an alphabet. Its
elements are called letters. For an alphabet $A$, we let $A^*$ denote the free monoid spanned by $A$ and equipped
with the concatenation. Elements of $A^*$ are called words on $A$; the length of the word $a_1 a_2 \ldots a_n$, with $a_i \in A$,
is $n$. Homomorphisms of monoids are called morphisms. A morphism from $A^*$ to $B^*$ is determined by the
images of the letters in $A$. It is called uniform if the images of all letters have the same length. The transition
matrix of a morphism $\sigma : A^* \to B^*$ counts the number of times the letter $b_j$ in $B$ occurs in $\sigma(a_i)$. Finally a
sequence is called automatic if it is the pointwise image of a fixed point of a nontrivial uniform morphism.

- Recall that the Thue-Morse sequence with values 0 and 1 can be defined as the fixed point beginning
with 0 of the morphism $0 \to 01$, $1 \to 10$ (see, e.g., [?]): it is the most famous example of automatic sequences
(see, e.g., [?]). The first few terms of the Thue-Morse sequence $(m_n)_{n \ge 0}$ are

011010011001011010 ...

The frequencies of occurrences of blocks of length 2 are given by $\mathcal{F}_{00} = \mathcal{F}_{11} = \frac{1}{6}$ and $\mathcal{F}_{01} = \mathcal{F}_{10} = \frac{1}{3}$: this is
a classical exercise that involves the morphism on four letters defined by $a \to ab$, $b \to ca$, $c \to cd$, $d \to ac$.
An alternative proof consists of noting that the sequence $((m_n + m_{n+1}) \bmod 2)_{n \ge 0}$ is the period doubling
sequence, i.e., the fixed point of the morphism $1 \to 10$, $0 \to 11$; the sum of frequencies of the blocks 01 and
10 in the Thue-Morse sequence is thus the frequency of 1's in the period doubling sequence which is easily
seen to be 2/3 (look at the transition matrix of the morphism $1 \to 10$, $0 \to 11$). Hence

$$\mathcal{I}((m_n)_{n \ge 0}) = 1 + \frac{2(\sqrt{2} - 1)}{3} = 1.276\ldots$$

Note that the "high" value of this inconstancy is related to the absence of long strings of 0's or of 1's: namely
the Thue-Morse sequence does not contain the blocks 000 and 111.

- The Shapiro-Rudin sequence $(r_n)_{n \ge 0}$ with values 0 and 1 can be defined as the sequence of parities
of the number of (possibly overlapping) 11's in the binary expansions of the integers $0, 1, 2, \ldots, n, \ldots$ (see,
e.g., [5]). It is clear that the sum of frequencies of occurrences of the blocks 01 and 10 is the frequency of
occurrences of the letter 1 in the sequence $(r'_n)_{n \ge 0}$ defined by $r'_n := (r_n + r_{n+1}) \bmod 2$. This last sequence
is easily seen to be the pointwise image under the map $a \to 0$, $b \to 0$, $c \to 1$, $d \to 1$ of the infinite fixed
point of the morphism $a \to ab$, $b \to cd$, $c \to ad$, $d \to cb$. (Hint: prove that both this pointwise image and the
sequence $(r'_n)_{n \ge 0}$ satisfy the recursive relations $r'_{4n} = 0$, $r'_{4n+1} = r'_{2n}$, $r'_{4n+2} = 1$, $r'_{4n+3} = 1 + r'_{2n+1} \bmod 2$,
with $r'_0 = 0$.) From this it is straightforward that the frequency of occurrences of 1 in the sequence $(r'_n)_{n \ge 0}$
is equal to 1/2. Hence

$$\mathcal{I}((r_n)_{n \ge 0}) = \frac{1 + \sqrt{2}}{2} = 1.207\ldots$$

which is the same inconstancy as for a random sequence.

- The (regular) paperfolding sequence $(z_n)_{n \ge 0}$ with values 0 and 1 can be defined by $z_{4n} = 0$, $z_{4n+2} = 1$,
$z_{2n+1} = z_n$. Reasoning as for the Shapiro-Rudin sequence (left to the reader) leads to

$$\mathcal{I}((z_n)_{n \ge 0}) = \frac{1 + \sqrt{2}}{2} = 1.207\ldots$$

which is again the same inconstancy as for a random sequence.
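
The three values above can be approximated from finite prefixes by estimating the frequencies of blocks of length 2 and applying Theorem 5.2 (ii). The following Python sketch is only a numerical illustration; the function names are ours, the Thue-Morse sequence is generated through the classical characterisation "parity of the number of 1's in the binary expansion of n", and the paperfolding sequence through the recurrences recalled above.

```python
from collections import Counter
from math import sqrt


def thue_morse(n):
    """Parity of the number of 1's in the binary expansion of n."""
    return bin(n).count("1") % 2


def shapiro_rudin(n):
    """Parity of the number of (possibly overlapping) 11's in binary n."""
    return bin(n & (n >> 1)).count("1") % 2


def paperfolding(n):
    """z_{4n} = 0, z_{4n+2} = 1, z_{2n+1} = z_n."""
    while n % 2 == 1:
        n = (n - 1) // 2
    return 0 if n % 4 == 0 else 1


def estimated_inconstancy(seq):
    """Sum over length-2 blocks (p, q) of frequency * sqrt(1 + (q - p)^2)."""
    blocks = Counter(zip(seq, seq[1:]))
    total = len(seq) - 1
    return sum(count / total * sqrt(1 + (q - p) ** 2) for (p, q), count in blocks.items())


terms = 2 ** 16   # length of the prefix used for the estimate (arbitrary choice)
estimates = {
    name: estimated_inconstancy([f(n) for n in range(terms)])
    for name, f in [("Thue-Morse", thue_morse),
                    ("Shapiro-Rudin", shapiro_rudin),
                    ("paperfolding", paperfolding)]
}
for name, value in estimates.items():
    print(f"{name:14s} {value:.4f}")
```

With this prefix length the estimates agree with 1.276 for the Thue-Morse sequence and with 1.207 for the other two sequences to three decimal places.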
6.4. Sturmian sequences

Recall that a Sturmian sequence can be defined as a (binary) sequence having exactly $n + 1$ blocks of
length $n$ for every integer $n \geq 1$ (see, e.g., [?], [34]). In particular Sturmian sequences are not ultimately
periodic, and the blocks 00 and 11 cannot both occur in a same Sturmian sequence. Since interchanging
0's and 1's in a Sturmian sequence gives a Sturmian sequence, we may suppose that no 11 occurs. But
then the frequencies of occurrences of the blocks 01 and 10 in the sequence are both equal to the frequency
of occurrence of 1, hence to the slope of the Sturmian sequence (see, e.g., [?, Theorem 10.5.8, page 318]).
Thus the inconstancy of a Sturmian sequence of slope $\alpha \in (0, 1)$ without the block 11 in it (resp. of slope
$1 - \alpha \in (0, 1)$ without the block 00 in it) is

$$\mathcal{I} = 1 + 2(\sqrt{2} - 1)\alpha.$$

Recall that if the sequence does not contain the block 11, then $\alpha$ belongs to $(0, 1/2)$, hence as expected $\mathcal{I}$
belongs to $(1, \sqrt{2})$.
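
As a numerical illustration, the sketch below generates a prefix of a Sturmian sequence of slope alpha through the classical formula u_n = floor((n+1) alpha) - floor(n alpha) with alpha irrational, and compares the proportion of blocks 01 and 10 with 2 alpha. The function name `sturmian`, the choice alpha = (3 - sqrt(5))/2 and the prefix length are ours.

```python
from math import floor, sqrt


def sturmian(alpha, terms):
    """Prefix of the mechanical word u_n = floor((n+1)*alpha) - floor(n*alpha)."""
    return [floor((n + 1) * alpha) - floor(n * alpha) for n in range(terms)]


alpha = (3 - sqrt(5)) / 2          # irrational slope in (0, 1/2), about 0.382
u = sturmian(alpha, 100_000)

# Proportion of length-2 blocks that are 01 or 10 (jumps of the curve).
jumps = sum(1 for p, q in zip(u, u[1:]) if p != q)
jump_frequency = jumps / (len(u) - 1)

# Theorem 5.2 (ii) with h = 1: 1 + (sqrt(2) - 1) * (F_01 + F_10).
estimate = 1 + (sqrt(2) - 1) * jump_frequency
predicted = 1 + 2 * (sqrt(2) - 1) * alpha

print(estimate, predicted)
```

Both printed values are close to 1.316, and the generated prefix contains no block 11, as expected for a slope smaller than 1/2.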

Remark 6.1. A possible application of inconstancy of infinite sequences can be to "predict" the $n$th term
of a very long (or infinite) sequence knowing its first $n - 1$ terms: if $n$ is large enough, $u_n$ "should" be close
to a value minimizing the difference $|\mathcal{I}(\Gamma_n) - \mathcal{I}(\Gamma_{n-1})|$.

Remark 6.2. A different way of defining the inconstancy of a binary sequence could be to interpret it as
a sequence on the alphabet {L(eft), R(ight)}. Then to associate with this (LR) sequence a 2D curve drawn
on the lattice $\mathbb{Z}^2$, consisting of horizontal and vertical segments. The first segment is $(0,0) - (1,0)$; then for
each value of the LR sequence we make a $\pm\pi/2$ turn. The inconstancy of the sequence could be defined
as the inconstancy of the curve obtained that way. The reader will have recognized curves studied, e.g., in
[44], where paperfolding sequences enter the picture. This notion of inconstancy for sequences would thus
be terminologically closer to the "folding index" of [15]. Since the choice of $\pm\pi/2$ is arbitrary (another angle
could have been chosen), it is not clear whether this definition is pertinent or if one should consider all
possible angles, thus obtaining a set of inconstancies for any given sequence.

7. Algorithmic aspects

In order to compute the inconstancy $\mathcal{I}(\Gamma) := \frac{2\ell(\Gamma)}{\delta(\Gamma)}$ of a piecewise affine curve $\Gamma$, the perimeter of the
convex hull of $\Gamma$ is needed. Hence we need to construct the convex hull of a finite set consisting of, say, $n$
points. Several algorithms are available, their complexity is in $O(n \log n)$ (see, e.g., the Graham scan studied
in [27], the Jarvis march studied in [29]; see also, e.g., the papers [51], [50]; in particular [50] gives an optimal
real-time algorithm for planar convex hulls).

Implementations of these algorithms are standard in common software: for example the command convhull
in Maple (with the package Convex), the command ConvexHull in Mathematica, the command convex_hull
in Scilab, or the command convhull (see also convhulln) in Matlab. Also note that Qhull computes
convex hulls, Delaunay triangulations, Voronoi diagrams, halfspace intersections about a point, furthest-site
Delaunay triangulations, and furthest-site Voronoi diagrams (see http://www.qhull.org/). Demonstrations
of computations can be found on several sites; see e.g.,

http://www.piler.com/convexhull/
http://www.cs.princeton.edu/courses/archive/fall08/cos226/demo/ah/GrahamScan.html
http://www.cs.princeton.edu/courses/archive/fall08/cos226/demo/ah/ConvexHull.html
http://www.cse.unsw.edu.au/~lambert/java/3d/hull.html



8. Conclusion 

Inspired by the theorem of Cauchy-Crofton, the inconstancy of a curve could be a way of detecting
large fluctuations of a curve, different from (and hopefully better than) usual indexes such as the residual
variance. We intend to test this idea in three domains: fluctuations of biological parameters [61], fluctuations
of the stock market [?] and smoothness of musical themes [1]. Two other directions could be the following.
First, a way of discriminating between models that describe a given phenomenon with the same error bound
(e.g., prediction of electric load and consumption) could be to choose the model for which the difference
between data and predictions has maximal inconstancy (when the inconstancy is close to 1, this difference
is "quasi-affine"; this means that there is a "quasi-affine" bias in the model that can/should be corrected a
priori). Second, we alluded to fractal-like "chaotic" (disordered) curves in the introduction; coming across,
e.g., the paper [12] we recall that measuring the "complexity" of geographic objects classically involves their
fractal dimension and, e.g., their "length"; we could also think of looking at their inconstancy (typically how
complicated a river can be, i.e., how far from straight it looks, can be measured by the number of intersection
points with a random straight line). A natural question then occurs: to what extent are fractal dimension and
inconstancy related? Or what can be said of the intersection with straight lines of a set with given fractal
dimension? Such questions also make sense for (in)finite sequences, in particular in view of Remark 6.2. Of
course the length of such curves is usually infinite while the length of the convex hull is finite (think of the
von Koch curve). What could be looked at for fractals obtained by "iteration" is the inconstancy at each
finite step of the iteration: it is conceivable that the fractal dimension shows up, though this is not the case
for the von Koch curve. Some ideas about these questions can be found, e.g., in [?], in
particular in relation with the entropy of a curve, as discussed in several papers of Mendès France. We will
conclude this paper with that notion of entropy for a plane curve. Let $p_n$ be the probability that a straight
line cuts the plane curve $\Gamma$ in exactly $n$ points, then the theorem of Cauchy-Crofton says that

$$\sum_{n \ge 1} n p_n = \frac{2\ell(\Gamma)}{\delta(\Gamma)}.$$

It is natural to define the entropy of $\Gamma$ by

$$H(\Gamma) := \sum_{n \ge 1} p_n \log \frac{1}{p_n}.$$

Now how large can this expression be? Define the set of sequences $\mathcal{P}$ by

$$\mathcal{P} := \left\{ (p_n)_{n \ge 1};\ p_n \ge 0,\ \sum_{n \ge 1} p_n = 1,\ \sum_{n \ge 1} n p_n = \frac{2\ell(\Gamma)}{\delta(\Gamma)} \right\}, \quad \text{and let } H_{\max}(\Gamma) := \max_{\mathcal{P}} H(\Gamma).$$

It can be proven (see [22] for details, also see [39]) that

$$H_{\max}(\Gamma) = \log \frac{2\ell(\Gamma)}{\delta(\Gamma)} + \frac{\beta}{e^{\beta} - 1},$$

where $\beta := \log \dfrac{2\ell(\Gamma)}{2\ell(\Gamma) - \delta(\Gamma)}$ (the quantity $\beta$ can be seen as the inverse of the temperature of the curve).
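
The formula for the maximal entropy can be checked numerically. The sketch below relies on the standard fact (not proved here) that, among probability distributions on the integers n >= 1 with prescribed mean m, the entropy is maximised by the geometric distribution p_n = (1 - q) q^(n-1) with q = (m - 1)/m; here m stands for 2 l(Gamma)/delta(Gamma) and is assumed to be larger than 1. The function names `h_max` and `geometric_entropy` are ours.

```python
from math import exp, log


def h_max(m):
    """log(m) + beta / (e^beta - 1) with beta = log(m / (m - 1)), for m > 1."""
    beta = log(m / (m - 1))
    return log(m) + beta / (exp(beta) - 1)


def geometric_entropy(m, terms=10_000):
    """Entropy of p_n = (1 - q) q^(n-1), n >= 1, where q = (m - 1) / m (mean m)."""
    q = (m - 1) / m
    total = 0.0
    for n in range(1, terms + 1):
        p = (1 - q) * q ** (n - 1)
        if p == 0.0:          # remaining terms underflow and contribute nothing
            break
        total -= p * log(p)
    return total


for m in (1.2, 2 ** 0.5, 1.9):   # sample values of the inconstancy 2l/delta
    print(m, h_max(m), geometric_entropy(m))
```

For each sample value the two printed entropies coincide up to rounding errors, and each exceeds log(m), the "modified" entropy discussed next.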

A modified definition is thus proposed in [38], namely

$$H(\Gamma) := \log \frac{2\ell(\Gamma)}{\delta(\Gamma)}.$$

This definition was used in several papers (see, e.g., [?], [21], [?]). With our terminology, it reads, as noted
by Mendès France, "the entropy is the logarithm of the inconstancy". The reader might think of comparing
this statement with the classical Weber-Fechner law in psychophysics according to which "sensation is
proportional to the logarithm of excitation" ([23]; see also http://psychclassics.yorku.ca/Fechner/).